Data Science at the Command Line: Facing the Future with Time-Tested Tools (Paperback)

Author: Jeroen Janssens
Publisher: O'Reilly
Publication date: 2014-10-28
ISBN-13: 9781491947852
ISBN-10: 1491947853
Binding: Paperback
Pages: 212




Description


This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You’ll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data.
To get you started—whether you’re on Windows, OS X, or Linux—author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools.
Discover why the command line is an agile, scalable, and extensible technology. Even if you’re already comfortable processing data with, say, Python or R, you’ll greatly improve your data science workflow by also leveraging the power of the command line.

Obtain data from websites, APIs, databases, and spreadsheets
Perform scrub operations on plain text, CSV, HTML/XML, and JSON
Explore data, compute descriptive statistics, and create visualizations
Manage your data science workflow using Drake
Create reusable tools from one-liners and existing Python or R code
Parallelize and distribute data-intensive pipelines using GNU Parallel
Model data with dimensionality reduction, clustering, regression, and classification algorithms




Related Books

Research on Machine-Learning-Based Acoustic Emission Signal Processing Algorithms (基於機器學習的聲發射信號處理算法研究)

Authors: 周俊, 朱文耀, 王超

2014-10-28

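The B2B Product Manager's Training Handbook: AI Product Planning and Commercial Implementation (B端產品經理修煉手冊 : AI產品規劃與商業落地)

Author: 李博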

2014-10-28

Build a Career in Data Science

Authors: Jacqueline Nolis, Emily Robinson

2014-10-28